(54) Method and apparatus for video skimming

(57) The present invention relates to a system for
searching and browsing multimedia, and more particularly,
to a video skimming method and apparatus which allow a
user to grasp the full content of a video within a short
time and to move rapidly to a desired portion. The content
of the video is skimmed based on scenes and shots formed
by shot clustering and shot segmentation: scenes to be
reproduced and scenes to be skipped are selected when
performing video skimming, and a particular portion in a
shot of each scene to be reproduced is then reproduced
continuously, or partially by a skipping technique.
Description

BACKGROUND OF THE INVENTION 

1. Field of the Invention

[0001] The present invention relates to a system for
searching and browsing multimedia, and more particularly,
to a video skimming system which allows a user to briefly
grasp the full content of a video and to move rapidly to a
desired portion, based on a meaningful story structure
that follows the progress of the content of the video,
taken from the structural information of the video content.

[0002] As mass media has developed and the production of
multimedia content has become easier, the quantity of
media received by the general public every day has become
enormous. As the volume of multimedia content has grown,
a demand has arisen for an automatic system that sorts out
the data desired by a user, and methods for meeting this
demand are being studied. In particular, with the
development of digital technology, there is a growing
trend in which video content is stored and distributed in
a digital format. When digital broadcasting becomes
popular, the digitalization of media will be accelerated.
[0003] With such digital video content, one user may wish
to view only sports-related news, while another user may
wish to view only securities-related news. In addition, a
user may ask to view only the scenes in which a particular
person appears in a show program. Various studies are
being made in order to accommodate such user requests.
[0004] Moreover, a user may ask to grasp the full video
content within a limited time. Such a request is met by
"highlights". Generally, highlights can be understood as
content newly composed of the important scenes of a video
content. These include, for example, "Sports Highlights",
"Preview of Movie", "Headline News" and the like. However,
with current technologies it is very difficult to automate
the extraction of highlights from a video content. Thus,
in most cases, this extraction depends upon manual work.
As mentioned above, since the quantity of media has
increased explosively, a great deal of manpower would be
needed to manually provide highlights of every video
content, which is almost impossible. Therefore, an
automated system is needed in order to allow a user to
understand the outline of the content within a short time.
[0005] With the development of digital technologies, a
key frame is used for moving to a desired position in a
video content. By using a video summary based on key
frames, a user can move to a desired position rapidly. A
large number of key frames are needed in order to easily
search for a desired portion by using key frames, but it
is difficult to display a large number of key frames in a
restricted display space. Thus, the user is required to
make many selections. In addition, it is generally
difficult to understand the full content of a video by
the method using key frames.

[0006] Recently, various video indexing techniques are
being studied for searching for a desired scene in a
digital video. For a user wanting only the scenes in which
a particular person appears, studies are being made on
indexing information on the appearance of a person by
searching for the scenes in which the person appears in a
video and recognizing who the person is, and on extracting
principal scenes from a movie or sports program and
indexing the same. However, the genres of video are very
diverse and the data to be indexed differ greatly between
genres. Hence, it is known that it is very difficult to
implement an automated system for extracting meaningful
information with a high level of accuracy by the current
techniques.

[0007] On the other hand, in digital video, unlike analog
video, the degradation of image quality can be prevented
when fast wind/fast rewind functions are executed.

[0008] As a fast reproduction method generally used in
digital video, either a method for increasing the number
of frames decoded per unit time and displaying parts
thereof, or a method for decoding and displaying frames
while skipping a certain portion, is used.
[0009] However, the method for increasing the number of
frames decoded per unit time is disadvantageous in that
the maximum speed is limited by the performance of the
terminal device. Thus, for the fast wind/fast rewind of a
digital video, the method for decoding and displaying
frames while skipping a certain portion is used in
general. The fast wind/fast rewind technique in digital
video is the most reasonable of the existing techniques
for meeting the request of a user wanting to understand
the full content within a restricted time or wanting to
move to a desired portion. However, predetermined
intervals of time are used in skipping a certain portion,
and thus there is a disadvantage that the user misses the
scene of a desired portion or that a less important
portion is reproduced relatively often.

SUMMARY OF THE INVENTION 

[0010] A video skimming method according to the present
invention includes the steps of: recognizing an individual
shot section, a physical editing unit, as structural
information for a video stream by shot segmentation;
selecting a particular portion in the recognized
individual shot section as reproduction video information
reflecting the content of the corresponding shot; and
continuously reproducing the video information selected
for the individual shot.

[0011] Here, the video skimming method further includes
the shot selection step of determining shots to be
reproduced and shots to be skipped after recognizing the
individual shot section.
[0012] Here, the structural information in the video
stream is scene information, i.e., a logical story unit,
and shot information, i.e., a physical editing unit,
displayed together with temporal technical information
(starting position and duration, or starting position and
end position), and further includes technical information
on shot properties.

[0013] Here, in the shot selection step, the effect of
repetitively reproducing shots having similar properties
is minimized by determining to skip some of the shots
having similar properties and to use only the remaining
shots for skimming.

[0014] Here, in selecting shots to be reproduced from
the similar shots, shots to be used for skimming are
selected by giving a higher weight value for selection to
shots located in the latter half of a scene.
[0015] Here, as a reproduction portion (segment)
representative of each shot, the front portion, rear
portion and center portion of the corresponding shot, or
the front portion and rear portion thereof, are used at
the same time. The length of the reproduction portion
(segment) representative of each shot is identical.
[0016] Here, if the length of the segment selected as the
reproduction portion in each shot is larger than the
length of the corresponding shot, the length of the
reproduction portion in the individual shot is reduced to
below the length of the corresponding shot.
[0017] Here, based on the average value of the
image/motion/audio similarities in the individual shot,
the length of the reproduction portion (segment)
representative of each shot is reduced if there is a high
similarity, or the reproduction length is increased if
there is a low similarity.

[0018] Here, the image/motion/audio similarities in the
shot representative of the scene mean the similarity in
frames, motion vectors and audio data with different time
positions.

[0019] Here, the reproduction speed of the segments to be
reproduced in each shot is controlled variably.
[0020] Here, the reproduction section is reproduced at a
high speed either by increasing the number of frames
decoded per unit time, thereby making the reproduction
speed higher than the normal speed, or by skipping a few
frames in the middle without decoding all frames in the
reproduction section.

[0021] Here, when the high speed skimming method using
skipping is applied to a video stream using a coding
scheme utilizing interframe compression such as MPEG,
the frames to be decoded are I frames, whose frame data
can be obtained by decoding only the corresponding frame
without decoding other frames.
[0022] In addition, a video skimming apparatus according
to the present invention includes: a user interface unit
for inputting a user command for video skimming in order
to search and browse digital video data as multimedia
data; a control unit for skimming the corresponding video
file based on structural information for the video content
according to the user command inputted from the user
interface unit; a video information file for providing the
structural information for the video content to the
control unit as index information for the digital video
data and the corresponding video; and a display unit for
reproducing the video skimmed by the control unit.
[0023] Here, the structural information for the video
content includes shots representative of the corresponding
scene, a logical story unit, that are based on scene
information and are reproduced from shot information, a
physical editing unit which is an element of the scene.

[0024] Here, the structural information for the video
content further includes segments to be reproduced in the
shot representative of the corresponding scene.

[0025] Here, the user interface unit includes a unit for
designating a summary level as a degree of video skimming
or a unit for designating the speed of a reproduction
section in video skimming, in order to select the summary
level or reproduction speed of video in video skimming.

[0026] Here, the control unit reads video index
information related to shot segmentation information and
shot clustering information from the index file according
to a skimming condition given by a user input or by basic
settings, calculates the segments to be reproduced that
conform to the video skimming condition, reproduces the
corresponding segments from the related media file
continuously, and outputs the same to the display unit.

BRIEF DESCRIPTION OF THE DRAWINGS

[0027] The above objects, features and advantages of the
present invention will become more apparent from the
following detailed description when taken in conjunction
with the accompanying drawings, in which:

Figure 1 is a view explaining the concept of shot
segmentation and clustering;
Figure 2 is a view explaining the concept of a video
skimming method using shot segmentation information;
Figure 3 is a view illustrating an example of a method
of transition of shots of an interactive scene;
Figure 4 is a view illustrating an example of a scene
detection method using shot properties;
Figure 5 is a view illustrating an example of a method
for selecting shots to be reproduced and shots to be
skipped in skimming using structural information;
Figure 6 is a view explaining a method for selecting
shots to be reproduced and shots to be skipped in
consideration of the location of shots in a scene and
repetitive information;
Figure 7 is a view explaining a method for selecting a
portion to be skipped and a portion to be reproduced in
a shot;
Figure 8 is a view illustrating a method for selecting
a dynamic unit reproduction length using the
dissimilarity of a shot;
Figure 9 is a view explaining a quick skimming method
using skipping;
Figure 10 is a view explaining a skimming method using
structural information of video content; and
Figure 11 is a view illustrating one example of the
configuration of a system for video skimming using
structural information of video content.

DETAILED DESCRIPTION OF THE PREFERRED 
EMBODIMENT 

[0028] With the development of digital video techniques
and image/video recognition techniques, users have become
able to search, filter and browse only a desired portion
of a desired video at a desired point of time.
[0029] The most basic techniques for non-linear video
browsing and searching are the shot segmentation technique
and the shot clustering technique. These two techniques
are the most essential ones for analyzing video. Many
studies have so far concentrated on shot segmentation,
while the study of the shot clustering technique is only
just beginning.
[0030] Based on the results of various studies, the 
shot segmentation can be automated, and most algo- 
rithms can be implemented with a high accuracy of more 
than 90%. 

[0031] In addition, shot clustering can also be automated
with a high level of accuracy by applying a technique
suited to the genre of a program, either by detecting a
characteristic event or by using the general
characteristics of shots.

[0032] Generally, a video content is logically segmented
into several story units. Such a unit of story structure
is generally referred to as an event or scene, examples
of which include a gunfight scene, an interactive scene,
etc. Such a scene is constructed of a sequence of
sub-scenes or shots.

[0033] A shot denotes a sequence of video frames obtained
from one camera without interruption, which is the most
basic unit in video analysis or construction.
[0034] Generally, a video is constructed of a sequence of
many shots. Shot segmentation denotes a method for
segmenting a video into individual shots. Shot clustering
denotes a process for detecting the logical story
structure of a video content by regrouping shots into
logical scene units based on the individual shots and the
characteristics thereof.
[0035] The thusly configured video skimming system using
scene and shot information, i.e., structural information
of video content, according to the present invention will
now be described with reference to the accompanying
drawings.

[0036] Figure 1 is a view illustrating a shot
segmentation process and a shot clustering process.
Generally, most shot segmentation algorithms are based on
the feature that image/motion/audio similarity is present
within the same shot and image/motion/audio dissimilarity
is found between two different shots, and most shot
clustering algorithms are based on the feature that shots
having similar characteristics are detected again within
a predetermined time.

[0037] Generally, video highlights are made by selecting
meaningful segments in the progress of the content of a
video stream and continuously reproducing these segments.

[0038] However, it is very difficult to automate the
selection of meaningful segments in the progress of
various video contents.

[0039] Nevertheless, if shot segmentation information is
used for video skimming, it is possible to implement a
skimming method which reproduces only a certain portion of
each of the shots existing in a video and skips the
remaining portion, so that the video is reproduced at a
length smaller than that of the original stream. Such a
skimming method is advantageous in that a completely
automated system can be constructed, since the shot
segmentation technique can be automated, and in that the
problem of reproducing an unimportant scene at length or
missing an important scene, which arises in the fast
wind/fast rewind of general digital video, can be reduced.
[0040] Figure 2 is a view of a summary of a video
skimming method using shot segmentation information.
[0041] A portion shown in gray in Figure 2 indicates a
portion to be reproduced in the skimming method using
shot segmentation information, and the remaining portion
indicates a portion to be skipped.
[0042] However, in the case that only the shot
segmentation information is used in video skimming, scene
information, which is the logical story structure existing
in video content, is not used, and therefore repetitive
shots continue to be played in a particular event section
such as an interactive scene.

[0043] Figure 3 is a view illustrating an array structure
of shots in a long interactive scene. In Figure 3, the
shots are each represented as English capital letters
(A, B, C, D) based on shot properties detected by the shot
segmentation process.

[0044] In other words, the interactive scene represented
in Figure 3 is a scene in which character 1 and character
2 are viewed in close-up in turns, and which is
constructed of many shots.

[0045] However, if only the shot segmentation information
is used in video skimming, a certain portion of every one
of the shots in the interactive scene is reproduced.
Therefore, there is a disadvantage that this scene is
reproduced for a long time, though no additional
information other than the fact that two persons are
talking can be provided to a user.
[0046] In the present invention, the above-mentioned
disadvantage is overcome by performing video skimming in
consideration of shot information as well as scene
information as structural information of a video content.

[0047] In other words, the present invention proposes a
skimming method and apparatus for picking out shots to be
reproduced and shots to be skipped from the shots of each
scene existing in a video, reproducing only a certain
portion (segment) of each shot to be reproduced based on
the segment information constituting the shot, and
skipping the remaining portion, so that the video is
reproduced at a length smaller than that of the original
video stream.

[0048] As the result of various studies, it is known that
a scene of content such as a movie or drama can be
detected, relying on the fact that a particular event such
as a gunfight scene, interactive scene, etc. can be
detected, and thus an index structure of a ToC (Table of
Contents) format can be automatically generated.

[0049] Figure 4 is a view illustrating a process of
detecting a story unit for a general video content.
[0050] Each shot is represented as an English capital
letter based on shot properties detected by the shot
segmentation process, as in Figure 3. In the shot
transition structure of an interactive scene of a drama or
movie, in most cases a characteristic pattern of shots
such as A, B, A, B, ... is shown. Figure 4 shows the
process of determining the corresponding section to be one
scene if shots having similar properties are detected
within a predetermined period of time. In Figure 4, scene
1 consists of shots having a feature value of A, B, C.
Shots having a feature value of A, B, C do not exist for a
predetermined time after shot1-B3, and thus the scene is
detected by taking the finish point of time of shot1-B3 to
be the finish point of time of scene 1. In Figure 4, scene
2 consists of shots having a feature value of F, H, E. The
feature values F, H, E of the shots do not exist for a
predetermined time after the last shot of this scene, and
thus the finish point of time of scene 2 can be detected.
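
As an illustration only, the scene detection described above might be sketched as follows. The Shot record, the detect_scenes function and the window parameter are hypothetical names that do not appear in this specification, and the rule "a scene stays open while a label already seen in it recurs within the window" is one possible reading of the predetermined-time criterion.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    label: str    # shot property class, e.g. "A", "B", "C" (assumed given)
    start: float  # start time in seconds
    end: float    # end time in seconds


def detect_scenes(shots, window):
    """Group time-ordered shots into scenes.

    A scene stays open while some label already seen in it reappears
    within `window` seconds after the end of the last shot known to
    belong to the scene; otherwise the scene ends at that shot.
    """
    scenes = []
    i = 0
    while i < len(shots):
        labels = {shots[i].label}
        last = i  # index of the last shot known to belong to this scene
        j = i + 1
        while j < len(shots) and shots[j].start - shots[last].end <= window:
            if shots[j].label in labels:
                # A similar shot recurs, so every shot up to j joins the scene.
                labels.update(s.label for s in shots[last + 1 : j + 1])
                last = j
            j += 1
        scenes.append(shots[i : last + 1])
        i = last + 1
    return scenes
```

With ten-second shots labelled A, B, A, B, C, F, H, F and a window of 25 seconds, this sketch yields three scenes: A-B-A-B, then C alone, then F-H-F.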
[0051] Besides this method, it is possible to detect an
interactive scene more accurately by the process of face
detection and face recognition. Such a method is usually
applied to general dramas or movies.
[0052] As described above, the present invention
implements video skimming by using scene and shot
information, which are structural information for video
content, and considers how to select a shot to be
reproduced from the shots of a scene, how to select a
portion to be reproduced and a portion to be skipped from
the shot selected as the shot to be reproduced, how to
select a reproduction length of the portion to be
reproduced, and how to reproduce in a reproduction section.
[0053] Firstly, Figure 5 is a view illustrating a summary 
of the video skimming method of the present invention. 
[0054] In Figure 5, structural information of video
content indexed by the shot segmentation process and the
shot clustering process is used. In Figure 5, shots
selected for reproduction in video skimming using
structural information are indicated in gray and shots to
be skipped are indicated in white. That is, for video
skimming using structural information, a system first
determines the shots to be reproduced for each scene and
then determines a method for reproducing each individual
shot.

[0055] Figure 5 shows an example in which, among similar
shots, only a notable shot is reproduced, and only once,
so that repetitive shots among the shots of scene 1 are
not reproduced.

[0056] In the present invention, the shot selection for
determining shots to be reproduced and shots to be skipped
among the shots of each scene existing in a video stream
is achieved as follows.
[0057] In a method for selecting a representative shot
when many shots having similar properties exist in one
scene, the outline of the content of the scene can be
delivered by selecting any representative shot and using
the same in skimming, without any particular weight
conditions. However, in the story structure of general
dramas and movies, much more information is expressed in
the latter half of a scene. In other words, the
introduction part is usually less important than the
conclusion part. Therefore, in the step of selecting shots
to be reproduced in skimming when similar shots appear
many times in the scene, much more information can be
provided to a user by selecting shots in the latter half
of the scene as the shots to be reproduced.
[0058] Figure 6 illustrates a method (a of Figure 6) for
selecting shots to be reproduced in the former half of a
scene and a method (b of Figure 6) for selecting shots to
be reproduced in the latter half of the scene.
[0059] Both a and b of Figure 6 are examples of selecting
only one shot for skimming if similar shots exist in one
scene. In a of Figure 6, the shots appearing first are
selected as the shots to be reproduced among shots having
shot properties of A, B, C. In b of Figure 6, the shots
appearing last are selected as the shots to be reproduced
among shots having shot properties of A, B, C. Generally,
the method of b of Figure 6 shows higher user satisfaction
than the method of a of Figure 6.
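
The two selection policies of Figure 6 might be sketched as below. This is a simplified illustration, not the claimed method: select_shots and prefer_last are hypothetical names, and the "higher weight for the latter half" is reduced here to a hard choice of the last occurrence of each shot property.

```python
def select_shots(scene_labels, prefer_last=True):
    """Return the indices of the shots to reproduce in one scene.

    `scene_labels` lists the shot property class of each shot in time
    order. Exactly one shot is kept per class: its last occurrence when
    `prefer_last` is true (b of Figure 6), otherwise its first
    occurrence (a of Figure 6).
    """
    chosen = {}
    for index, label in enumerate(scene_labels):
        if prefer_last or label not in chosen:
            chosen[label] = index
    return sorted(chosen.values())
```

For a scene whose shots are labelled A, B, A, B, C, A, B, the first-occurrence policy keeps shots 0, 1 and 4, while the last-occurrence policy keeps shots 4, 5 and 6.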

[0060] Next, the method for selecting a portion to be
reproduced and a portion to be skipped in each shot will
now be described.

[0061] In skimming using structural information of video
content, a summary of the video content can be provided by
continuously reproducing the shots selected above.
However, a video skimming method that plays each full shot
generally provides only a very low level of summarization.
Usually a user can understand the content of the full shot
by viewing only parts of the shot. In the method for
selecting a portion to be reproduced from a shot selected
for reproduction in video skimming using structural
information of video content, the front portion, rear
portion or center portion of the shot can be selected
unconditionally. Figure 7 is a view illustrating a portion
to be skipped and a portion to be reproduced in the method
for video skimming using the front, rear and center
portions, or the front and rear portions, of a shot at the
same time.

[0062] As the result of a test, it is found that higher
user satisfaction is achieved by skipping the front
portion of the corresponding shot and reproducing the rear
portion thereof, though this differs according to the
genre of video. The reasons are that the conclusion part
(e.g., a goal scene in a soccer game) of the shot is more
important than the introduction part or development part
for understanding the content of the shot, and that, if a
method such as a stepwise chart explanation is used in a
program like news, parts of the content are expressed in
the former half of the shot and the full content is
expressed in the latter half.
[0063] However, the front portion of the shot may be
important according to the genre of video, for example,
educational broadcasting mainly for solving questions.
[0064] In such a broadcasting program, the information on
what questions are dealt with is present at the front
portion of the shot, and the work of solving the questions
continues after the front portion. Thus, in order to
reproduce a desired portion, much more information can be
provided to a user by reproducing the front portion of the
shot, rather than by reproducing the rear portion.

[0065] Therefore, in the present invention, the position
to be reproduced in the shot can be selected differently
according to the characteristics of the content of video,
and skimming can be implemented by using the front
portion, the center portion and the rear portion in
combination with one another in the same shot.
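
A minimal sketch of this position selection is given below, assuming times in seconds and a unit length that already fits inside the shot. The function pick_segments and the position names "front", "center" and "rear" are illustrative choices, not terms defined by this specification.

```python
def pick_segments(shot_start, shot_end, length, positions=("rear",)):
    """Return sorted (start, end) segments of `length` seconds in a shot.

    `positions` may combine "front", "center" and "rear". Segments can
    overlap when the shot is short relative to `length`; merging them is
    left out of this sketch.
    """
    duration = shot_end - shot_start
    if length > duration:
        raise ValueError("segment is longer than the shot")
    offsets = {
        "front": 0.0,
        "center": (duration - length) / 2.0,
        "rear": duration - length,
    }
    segments = []
    for position in positions:
        begin = shot_start + offsets[position]
        segments.append((begin, begin + length))
    return sorted(segments)
```

A news or sports program might use the default rear portion, whereas an educational program of the kind described above might pass positions=("front",).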
[0066] Next, the method for selecting a reproduction
length according to the present invention will now be
described.

[0067] The method for selecting a reproduction length in
each shot can be divided into the method for selecting
segments of the same length as the portion to be
reproduced for every selected shot and the method for
selecting a different reproduction length for each shot by
using a shot property.

[0068] At this time, the shot property used above is
based on the average image/motion/audio similarities in
one shot. That is, it can be judged that the larger the
image/motion/audio similarities in one shot are, the more
monotonous the scene is. In such a scene, skipping is
performed more often. On the contrary, it can be judged
that the smaller the image/motion/audio similarities in
the shot are, the more complicated the content of the
scene is. In such a scene, skipping is performed less
often. In this way, the length of a unit segment to be
reproduced can be adjusted dynamically.
[0069] This method is a method for skipping a portion
with much information less often and skipping a portion
with little information more often, without depending upon
the time length of the shot. By this method, video
skimming with a high level of user comprehension can be
provided as compared to the method for reproducing
segments of the same length for every selected shot.
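
One way to picture this dynamic adjustment is the sketch below. The linear mapping, the function name reproduction_length and the bounds min_length and max_length are assumptions made for illustration; the specification only states that higher similarity should give a shorter reproduction length and lower similarity a longer one.

```python
def reproduction_length(similarities, min_length=1.0, max_length=5.0):
    """Map a shot's average similarity (0..1) to a length in seconds.

    `similarities` holds per-frame-pair similarity values for one shot.
    A monotonous shot (mean near 1) gets `min_length`; a complicated
    shot (mean near 0) gets `max_length`. The linear form is assumed.
    """
    if not similarities:
        raise ValueError("at least one similarity value is required")
    mean = sum(similarities) / len(similarities)
    mean = min(1.0, max(0.0, mean))  # guard against out-of-range input
    return min_length + (max_length - min_length) * (1.0 - mean)
```

The result depends only on the similarity values and not on the duration of the shot, which matches the statement that the method does not depend upon the time length of the shot.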

[0070] Figure 8 is a view illustrating an example of a 
method for selecting a length to be reproduced and 
skipped based on image/motion/audio similarities in a 
shot. 



[0071] In the graph of Figure 8, the horizontal axis
indicates time and the vertical axis indicates the
accumulated value of the image/motion/audio
dissimilarities in the shot. These dissimilarity data
represent shot properties that can generally be extracted
by a shot segmentation algorithm.

[0072] As an example of dissimilarity, the difference in
color histogram variance between adjacent frames, or
between frames at predetermined intervals, can be taken.
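
For illustration, a frame-to-frame dissimilarity of this general kind might be computed as sketched below. Note the substitution: the sketch uses a plain normalized color histogram difference rather than a histogram variance, and the frame representation (a non-empty list of 8-bit (r, g, b) tuples), the bin count and the function names are all assumptions.

```python
from collections import Counter


def color_histogram(pixels, bins_per_channel=4):
    """Normalized histogram of (r, g, b) pixels with 8-bit channels."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


def histogram_dissimilarity(frame_a, frame_b, bins_per_channel=4):
    """Half the L1 distance between two frame histograms.

    Returns 0.0 for identical color distributions and 1.0 for
    distributions that share no histogram bin.
    """
    ha = color_histogram(frame_a, bins_per_channel)
    hb = color_histogram(frame_b, bins_per_channel)
    keys = set(ha) | set(hb)
    return 0.5 * sum(abs(ha.get(k, 0.0) - hb.get(k, 0.0)) for k in keys)
```

Summing these values over successive frame pairs of a shot would give an accumulated curve of the kind plotted on the vertical axis of Figure 8.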

[0073] In Figure 8, since the average rate of change of
shot A is larger than that of shot B, though both shot A
and shot B have a similar length, the circumstance in
which more portions are reproduced in shot B than in shot
A is shown.

[0074] In this way, unless the length of a shot is
considered in setting a reproduction section, an error
situation may occur in which the length of a reproduction
section becomes larger than that of the corresponding shot
(if the shot is very short). Hence, in the skimming method
of the present invention, in the exceptional case that the
length of a unit section becomes larger than that of the
corresponding shot, the full corresponding shot may be
selected as a reproduction section, or parts thereof may
be selected as a reproduction section in consideration of
the length of the corresponding shot.
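
The exceptional case above amounts to a small guard, sketched here under assumed names (fit_unit_length, fraction). Passing no fraction selects the full shot; passing a fraction selects a part of the shot proportional to its length.

```python
def fit_unit_length(unit_length, shot_length, fraction=None):
    """Return a reproduction length that fits inside the shot.

    If `unit_length` already fits, it is returned unchanged. Otherwise
    the full shot is used when `fraction` is None, or `fraction`
    (0 < fraction <= 1) of the shot length is used.
    """
    if unit_length <= shot_length:
        return unit_length
    if fraction is None:
        return shot_length
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return shot_length * fraction
```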
[0075] Next, the method for reproducing a scene, and a
reproduction section in a shot to be reproduced in the
scene, as structural information for video content will be
explained.

[0076] The video skimming method according to the present
invention can be applied in a backward direction as well
as a forward direction.
[0077] When the segments selected as reproduction
sections in each shot are continuously reproduced, a user
can understand the full content and obtain outline
information on the content in a short time. Besides, no
intervention is required for searching for a desired
position.
[0078] In the video skimming method of the present
invention, the method for reproducing the segments
selected as reproduction sections in each shot can be
divided into two.

[0079] A first method is one for reproducing each
segment in the same way as a normal reproducing method. A
second method is one for decoding parts of the frames in a
reproduction section and reproducing the same in the
section by using skipping.
[0080] The normal reproducing method is very common, so a
detailed description thereof will be omitted. The method
for decoding parts of the frames in a reproduction section
and reproducing the same in the section by using skipping
will now be described.
[0081] The method for decoding parts of the frames in a
reproduction section and reproducing the same in the
section by using skipping is a method for implementing
quick skimming. At this time, the frames to be displayed
can be designated as frames at predetermined intervals of
time. In a method using interframe compression such as
MPEG, I frames, which have no interframe dependency, can
be designated.
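
The frame designation for quick skimming might look like the sketch below. It assumes the frame types of the reproduction section are already known as a sequence of "I", "P" and "B" in display order; the function frames_to_decode and its parameters are hypothetical, and no real decoder API is involved.

```python
def frames_to_decode(frame_types, interframe_coded=True, step=5):
    """Return the indices of frames to decode and display.

    For interframe-coded streams only I frames are chosen, since they
    can be decoded without reference to other frames. Otherwise every
    `step`-th frame is chosen, i.e. frames at a fixed interval.
    """
    if interframe_coded:
        return [i for i, kind in enumerate(frame_types) if kind == "I"]
    return list(range(0, len(frame_types), step))
```

For the sequence I B B P B B P B B I B B P, the I-frame policy selects frames 0 and 9, while fixed-interval skipping with a step of 5 selects frames 0, 5 and 10.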

[0082] Figure 9 m a view illustrating an example of a 
q u ick skimmi n g method using skipping In a reproduction 
section. By using this method, a user can experience 
the effect of obtaining much information and reproduc- 
ing a video file at a high speed, 
[0083] As explained above, in the video skimming
method using structural information of video content of
the present invention, segments are designated by two
steps. Figure 10 is a view illustrating a summary of the
video skimming method using structural information of
video content according to the present invention.
[0084] When video skimming is requested, the system
loads an index file storing structural information of
video content including shot and scene information on
the video content. The system determines what shots
to reproduce for each scene and what shots to skip (in
the shot selection step), and determines segments to
be reproduced and segments to be skipped for each
scene selected for video skimming (in the segment
designation step). Through the two determination steps,
segments to be reproduced are continuously outputted
to a reproducing apparatus.
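
The two determination steps can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `Shot` and `Scene` records stand in for whatever the index file actually stores, and the two policy functions (`keep_every_other_shot`, `head_segment`) are hypothetical placeholders rather than the selection rules of the invention.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec) of a part to reproduce


@dataclass
class Shot:
    start: float  # shot start time in seconds
    end: float    # shot end time in seconds


@dataclass
class Scene:
    shots: List[Shot]


def skim(scenes: List[Scene],
         select_shots: Callable[[Scene], List[Shot]],
         designate_segments: Callable[[Shot], List[Segment]]) -> List[Segment]:
    """Run shot selection, then segment designation, and return the playlist."""
    playlist: List[Segment] = []
    for scene in scenes:
        for shot in select_shots(scene):                 # shot selection step
            playlist.extend(designate_segments(shot))    # segment designation step
    return playlist


def keep_every_other_shot(scene: Scene) -> List[Shot]:
    """Placeholder policy: reproduce every second shot of a scene."""
    return scene.shots[::2]


def head_segment(shot: Shot, length: float = 2.0) -> List[Segment]:
    """Placeholder policy: reproduce the front portion of the shot."""
    return [(shot.start, min(shot.end, shot.start + length))]
```

Passing the policies in as functions keeps the two steps independent, so either rule can be replaced without touching the loop that feeds the reproducing apparatus.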
[0085] Figure 10 is a view illustrating shots to be
reproduced by the shot selection step shown in gray, in
which only parts (segments) of a selected shot can be
reproduced and the remaining portion can be skipped.
[0086] Figure 11 is a view illustrating a skimming
apparatus for video skimming according to one embodiment
of the present invention.
[0087] As illustrated in Figure 11, the video skimming
apparatus of the present invention includes a user
interface unit 101 for inputting a user command such as a
degree of video skimming and a speed to be used in
skimming, a master control unit 102 for skimming a
corresponding video file based on indexing information on
shots and scenes according to the user command inputted
into the user interface unit 101, a media file 103 for
providing digital video stream information to the master
control unit 102, an index file 104 for providing the
indexing information on shots and scenes as structural
information corresponding to the media file, and a display
device unit 105 for reproducing the video skimmed by
the master control unit 102.

[0088] In the video skimming system of the present
invention of Figure 11, the index file 104 can be included
in the media file 103. The display device unit 105 is an
output device for displaying a video stream, including a
monitor, a speaker, etc. The user interface unit 101 is
an inputting means for receiving an input of a user,
including a keyboard, a mouse, a remote control, buttons,
etc.

[0089] The media file 103 is a file storing video (audio)
data, and the index file 104 is a file storing index
information on video containing shot clustering information
and shot segmentation information.
[0090] The user requests video skimming by using
the user interface unit 101.

[0091] When the video skimming is requested, a summary
level (degree of skimming) can be designated and
also a speed to be used in skimming can be designated.
That is, the user designates how many minutes it takes
to compress the full video for viewing by using the user
interface unit 101. The master control unit 102 determines
what portion of what shot will be reproduced for
skimming based on the media file 103 and the subsequent
information of the index file 104 according to the
input of the user, and determines at what speed each
segment will be reproduced. By completing this process,
the master control unit 102 can provide a video
skimming function to the user by decoding the media file
103 and displaying the corresponding frames on the
display device unit 105.
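
As a worked example of how a requested viewing time and a skimming speed might translate into per-shot segment lengths: assuming N selected shots, each contributing a segment of equal length L seconds that is played at speed factor s, the viewing time is T = N * L / s, so L = T * s / N. This formula and the function names below are illustrative assumptions, not taken from the specification.

```python
from typing import List


def segment_length(target_sec: float, speed: float, shot_count: int) -> float:
    """Source-time length per shot so that all segments fit the target time."""
    if target_sec <= 0 or speed <= 0 or shot_count <= 0:
        raise ValueError("target_sec, speed and shot_count must be positive")
    return target_sec * speed / shot_count


def plan_segments(shot_lengths: List[float], target_sec: float,
                  speed: float) -> List[float]:
    """Per-shot segment lengths, never longer than the shot itself."""
    base = segment_length(target_sec, speed, len(shot_lengths))
    return [min(base, length) for length in shot_lengths]


def viewing_time(segment_lengths: List[float], speed: float) -> float:
    """Wall-clock time needed to reproduce the segments at the given speed."""
    return sum(segment_lengths) / speed
```

Because a segment cannot exceed its shot, short shots make the actual viewing time fall slightly below the requested one; a fuller planner could redistribute the unused time to longer shots.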

[0092] As described above, the present invention has
disclosed a video skimming method for simultaneously
complying with a user request for understanding the full
content and moving to a desired position within a
restricted time under a digital video environment.
[0093] In the present invention, the possibility of
reproducing a less important portion relatively often or
missing an actually desired scene is minimized, and the
possibility of repetitively reproducing an interactive
scene or a particular scene in turns is minimized, which
are the problems that can occur in the existing video
skimming method.

[0094] The video skimming method of the present
invention is a method for minimizing the necessity of a
user input according to a user request for moving to a
desired position.

[0095] By using the video skimming function of the
present invention, the user can understand the full
content within a short time, will not miss an important
portion in understanding the full content, and can skip a
boring portion easily.

[0096] In addition, the user can use the video skimming
method of the present invention when he or she
wants to move to a desired position. This method is
advantageous in that it requires fewer user input requests
as compared to the method using key frames.
[0097] In conclusion, the present invention can be
employed, for instance, for use in reproducing video
highlights, and can be utilized as a function of rapidly
searching a desired scene while minimizing a user input
request if it is used together with a high speed reproducing
method in reproducing reproduction sections of each
shot.

Claims 

1. A video skimming method, comprising the steps of:

recognizing an individual shot section, a physical
editing unit, as structural information for a
video stream by shot segmentation;

selecting a particular portion in the recognized
individual shot section as reproduction video
information reflecting the content of the
corresponding shot; and

continuously reproducing the video information
selected for the individual shot.

2. The method of claim 1, wherein the video skimming
method further includes the shot selection step of
determining shots to be reproduced and shots to be
skipped after recognizing the individual shot section.

3. The method of claim 1, wherein the structural
information on the video stream is scene information,
i.e., a logical story unit, and shot information, i.e., a
physical editing unit, displayed together with temporal
technical information (starting position and duration,
or starting position and end position), and which
further includes technical information on shot
properties.

4. The method of claim 2, wherein, in the shot selection
step, the effect of repetitively reproducing shots
having similar properties is minimized by determining
to skip parts of the shots having similar properties
and to use only the remaining shots for skimming.

5. The method of claim 4, wherein, in selecting shots
to be reproduced from the similar shots, shots to be
used for skimming are selected by giving a higher
weight value for selection to shots located at the
latter half of a scene.

6. The method of claim 1, wherein, as a reproduction
portion (segment) representative of each shot, the
front portion, rear portion and center portion of the
corresponding shot or the front portion and rear portion
thereof are used at the same time.

7. The method of claim 1, wherein the length of the
reproduction portion (segment) representative of
each shot is identical.

8. The method of claim 7, wherein, if the length of the
segment selected as the reproduction portion in
each shot is larger than the length of the corresponding
shot, the length of the reproduction portion
in the individual shot is reduced to below the
length of the corresponding shot.

9. The method of claim 1, wherein, based on the average
value of the image/motion/audio similarities
in the individual shot, the length of the reproduction
portion (segment) representative of each shot is
reduced if there is a high similarity, or the reproduction
length is increased if there is a low similarity.



10. The method of claim 8, wherein the image/motion/
audio similarities in the shot representative of the
scene mean the similarity in frames, motion vectors
and audio data with different time positions.

11. The method of claim 9, wherein, if the length of the
segment selected as the reproduction portion in
each shot is larger than the length of the corresponding
shot, the length of the reproduction portion
in the individual shot is reduced to below the
length of the corresponding shot.

12. The method of claim 1, wherein the reproduction
speed of segments to be reproduced in each shot
is controlled variably.

13. The method of claim 12, wherein the reproduction
section is reproduced at a high speed by increasing
a number of frames to be decoded per unit time and
then making the reproduction speed higher than a
normal speed.

14. The method of claim 13, wherein the reproduction
section is reproduced at a high speed by skipping
a few frames in the middle without decoding all
frames in the reproduction section.

15. The method of claim 14, wherein, when the high
speed skimming method using skipping is adapted
to a video stream using a coding scheme utilizing
interframe compression such as MPEG, decoding
frames are I frames which can obtain frame data by
decoding only the corresponding frame without
decoding other frames.

16. A video skimming apparatus, comprising:

a user interface unit for inputting a user command
for video skimming in order to search and
browse digital video data as multimedia data;

a control unit for skimming the corresponding
video file based on structural information for
video content according to the user command
inputted from the user interface unit;

a video information file for providing the structural
information for the video content to the
control unit as index information for the digital
video data and the corresponding video; and

a display unit for reproducing the video
skimmed by the control unit.

17. The apparatus of claim 16, wherein the structural
information for the video content includes shots
representative of the corresponding scene, a logical
story unit, that are based on scene information and
are reproduced from shot information, a physical
editing unit which is an element of the scene.

18. The apparatus of claim 17, wherein the structural
information for the video content further includes
segments to be reproduced in the shot representative
of the corresponding scene.

19. The apparatus of claim 16, wherein the user interface
unit comprises a unit for designating a summary
level as a degree of video skimming or a unit for
designating the speed of a reproduction section in
video skimming in order to select the summary level
or reproduction speed of video in video skimming.

20. The apparatus of claim 16, wherein the control unit
reads video index information related to shot
segmentation information and shot clustering information
from the index file according to a skimming condition
by using a user input or basic settings, calculates
segments to be reproduced conforming to the
video skimming condition, reproduces the corresponding
segments from the related media file continuously,
and outputs the same to the display unit.

21. A video skimming apparatus, comprising:

a storage unit for storing digital video data,
scene information which is a logical story unit of
a video content, and shot information which is a
physical editing unit of a video content;
a detection unit for detecting shot information
representative of a particular scene based on
the scene information corresponding to the video
data for video skimming;
a selection unit for selecting segments to be
reproduced and segments to be skipped in the
detected shot; and
a reproduction unit for continuously reading the
selected segments to be reproduced from the
storage unit and reproducing the same.
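
As a rough illustration of the rules recited in claims 4, 5, 8, 9 and 11, the sketch below picks shots from a group of similar shots with a higher weight for the latter half of a scene, and adapts a segment length to the average similarity inside a shot. The weight value, the tie-breaking order, the linear scaling factor and all function names are assumptions made for this example; the claims do not fix any of them.

```python
from typing import List


def pick_from_similar(shot_positions: List[int], scene_shot_count: int,
                      keep: int, late_weight: float = 2.0) -> List[int]:
    """Keep `keep` shots from a group of similar shots in one scene.

    shot_positions are 0-based positions of the similar shots in the scene.
    Shots in the latter half of the scene get the (assumed) higher weight.
    """
    def weight(position: int) -> float:
        return late_weight if position >= scene_shot_count / 2 else 1.0

    # Highest weight first; earlier position breaks ties (an assumption).
    ranked = sorted(shot_positions, key=lambda p: (-weight(p), p))
    return sorted(ranked[:keep])


def adapted_length(base_len: float, similarity: float,
                   shot_len: float) -> float:
    """Shorten the segment for high similarity, lengthen it for low.

    similarity is the average image/motion/audio similarity in [0, 1].
    The factor (1.5 - similarity) is an assumed linear mapping.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    scaled = base_len * (1.5 - similarity)
    # The reproduction portion never exceeds the shot itself.
    return min(scaled, shot_len)
```

For instance, a static dialogue shot with similarity near 1 would contribute about half the base length, while a fast-changing shot with similarity near 0 would contribute up to one and a half times the base length, capped at the shot length.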
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